Abstract

We address the problem of downhill protein folding in the framework of a simple statistical mechanical model, which allows an exact solution for the equilibrium and a semi-analytical treatment of the kinetics. Focusing on protein 1BBL, a candidate for downhill folding behavior, and comparing it to the WW-domain of protein PIN1, a two-state folder of comparable size, we show that there are qualitative differences in both the equilibrium and kinetic properties of the two molecules. However, the barrierless scenario which would be expected if 1BBL were a true downhill folder is observed only at low enough temperature.
I. INTRODUCTION



The folding of small single-domain proteins is often described as a two-state process, 
where a molecule starts from, e.g., a disordered thermodynamic state and evolves towards its 
ordered (native) thermodynamic state with a kinetics characterized by a single exponential 
behavior. If a suitable reaction coordinate can be identified, and the corresponding free 
energy profile can be computed, one expects, in the vicinity of the denaturation temperature, 
a profile with two minima separated by a barrier. 

In recent years it has however been suggested [1] that there are proteins whose kinetics, at all temperatures, might not be hindered by a free energy barrier, thus reproducing the "downhill folding" scenario proposed earlier. Typical features of a downhill folder would be a continuous variation of the thermodynamic state and nonexponential time behavior. Protein 1BBL has been thoroughly investigated by Muñoz and coworkers, and several results seem to indicate that it is one such downhill folder, though they have been questioned by other authors [10, ..., 15].



The purpose of the present paper is to investigate the folding behavior of 1BBL, compared to the WW-domain of protein PIN1, a clear two-state folder of comparable size, in the framework of a simple statistical mechanical model, which allows an exact solution of the equilibrium and a semi-analytical treatment of the kinetics. We are going to consider the Wako-Saito (WS) model [16, ..., 27], a one-dimensional model with binary degrees of freedom and long-range, many-body interactions, which has found applications even outside the protein folding domain [28, 29, 30]. The WS model has already been applied to this problem in earlier work, where an approximate solution for the equilibrium was used. In the present paper we shall make use of the exact solution for the equilibrium, following the approach described in [31, 32], while for the study of the kinetics we shall resort to Monte Carlo simulations and the local equilibrium approach [33, 34].

The plan of the paper is as follows: in Sec. II we shall describe the WS model, then the equilibrium behavior of the BBL and PIN1 molecules will be studied in Sec. III, after a careful discussion of the choice of model parameters. Kinetics will be discussed in Sec. IV, and in Sec. V we shall analyze the molecules' behavior from the perspective of free energy profiles. Finally, our conclusions will be drawn in Sec. VI.
II. THE MODEL



The WS model was introduced in 1978 by Wako and Saito [16, 17]. Yet, it was largely unknown to the wide community of researchers in protein folding when, about two decades later, it was independently reintroduced by Eaton and coworkers [18, 19, 20, 25], with minor differences with respect to the former model, as a simple and efficient theoretical tool to interpret their experimental data. Thanks also to this experiment-oriented approach, the model achieved some popularity and was then used by many different authors for several purposes [..., 27], even in a problem of strained epitaxy [..., 30]. In the following we shall use our exact solution of the equilibrium thermodynamics [31, 32] and our local equilibrium approach (LEA) to the kinetics [33, 34].

The model is a Go-like model [35] which considers a protein as a sequence of $N + 1$ amino acids connected by C-N peptide bonds. The degrees of freedom of the model are associated to the peptide bonds and will be denoted by $m_k$, $k = 1, 2, \ldots, N$. These are binary variables which take values in {0, 1}. In the case $m_k = 1$ the k-th peptide bond is assumed to be in a native-like (or ordered) conformation, while $m_k = 0$ stands for an unfolded (or disordered) conformation. For each peptide bond the set of unfolded conformations is of course larger than the native one, and this is taken into account in an effective way by introducing the entropy cost $q_k > 0$ of ordering bond k. In [36] it is shown how to give an explicit realization of this entropy cost in terms of microscopic degrees of freedom. The main feature of the model is that two amino acids interact only if they are in contact in the native state (non-native interactions are neglected, in the spirit of Go-like models) and if all the peptide bonds between them are in the native state (that is, the corresponding dihedral angles assume their native values). The latter is a drastic assumption which makes the model amenable to analytic treatments, up to the exact solution of the equilibrium. Nevertheless, the model has been shown to give realistic results for the kinetics of protein folding.

The effective free energy (sometimes called "effective Hamiltonian" in the physics literature) of the model can be written as

$$H(m) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \epsilon_{ij} \Delta_{ij} \prod_{k=i}^{j} m_k - RT \sum_{k=1}^{N} q_k (1 - m_k), \qquad (1)$$

where R is the gas constant and T the absolute temperature. $\Delta$ is the contact matrix, and its element $\Delta_{ij}$ takes value 1 if amino acids i and j + 1 are in contact in the native state (that is, if they have at least a pair of atoms closer than 0.4 nm according to the structure deposited in the Protein Data Bank [37]) and 0 otherwise. The corresponding contact energy $\epsilon_{ij} < 0$ is defined as in [20] as $-k\epsilon$ if the number $n_{ij}$ of atom pairs in contact satisfies $5(k - 1) < n_{ij} \le 5k$. Here $\epsilon$ is an energy scale which is determined, together with the entropic costs $q_k$, as described in the next section.
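As an illustration of Eq. (1), the following minimal sketch evaluates the effective free energy of a single configuration. It is not the authors' code: the function name `ws_energy`, the 0-based indexing, and the toy contact map with uniform contact energies are assumptions made only for this example.

```python
import numpy as np

def ws_energy(m, eps, delta, q, RT):
    """Effective free energy of Eq. (1) for one configuration (0-based indices).

    m     : N binary values m_k (1 = native, 0 = unfolded)
    eps   : (N, N) array, eps[i, j] = contact energy for the pair (i, j), i < j
    delta : (N, N) array, delta[i, j] = 1 if the contact exists in the native state
    q     : N entropy costs q_k
    RT    : gas constant times temperature, in the same units as eps
    """
    m = np.asarray(m)
    q = np.asarray(q, dtype=float)
    n = len(m)
    energy = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            # a contact contributes only if every bond from i to j is native
            if delta[i, j] and m[i:j + 1].all():
                energy += eps[i, j]
    # unfolded bonds lower the free energy by RT * q_k each
    return energy - RT * np.sum(q * (1 - m))

# Hypothetical toy system: 4 bonds, 3 native contacts, uniform energies.
toy_delta = np.zeros((4, 4), dtype=int)
toy_delta[0, 1] = toy_delta[0, 3] = toy_delta[2, 3] = 1
toy_eps = -np.ones((4, 4))
toy_q = [1.0, 1.0, 1.0, 1.0]

print(ws_energy([1, 1, 1, 1], toy_eps, toy_delta, toy_q, RT=1.0))
print(ws_energy([1, 1, 0, 0], toy_eps, toy_delta, toy_q, RT=1.0))
```

In the second configuration only the contact spanning bonds 0 and 1 survives, because the stretches containing the other two contacts include an unfolded bond.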



III. EQUILIBRIUM 



The first exact solution of the equilibrium was already reported in [16, 17] and was then forgotten until, as far as we know, the original papers were cited again in more recent work. In the meanwhile, we had developed our approach [31, 32] to this exact solution, which we shall follow in this paper. It relies on a mapping to a two-dimensional problem, which stems from the introduction of the new binary variables $x_{ij} = \prod_{k=i}^{j} m_k$ for $1 \le i \le j \le N$, which take value 1 only if all the peptide bonds belonging to the stretch from i to j are in the native state, and 0 otherwise. In the case i = j we have of course $x_{ii} = m_i$. These new variables are apparently non-interacting, since the effective free energy Eq. (1) is a linear function of them, but they actually interact through the constraints $x_{ij} = x_{i+1,j}\, x_{i,j-1}$ which they must satisfy. Our new variables can be associated to a triangle-shaped portion of a two-dimensional (square) lattice and the model can be easily solved by transfer matrix, since due to the constraints the matrices involved are at most of rank N. In [31] it is shown how to compute the partition function, the free energy as a function of the number of native peptide bonds and relevant expectation values.
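The exact solvability can be illustrated with a short sketch. The code below does not reproduce the transfer-matrix construction of [31]; it uses a simpler recursion over native stretches (an assumption of this example), exploiting the fact that every configuration is a sequence of fully native stretches separated by unfolded bonds. It is checked against brute-force enumeration on a small hypothetical contact map; all names and parameter values are illustrative.

```python
import itertools
import math

def island_weights(eps, delta, RT):
    """w[i][j] = Boltzmann weight of a fully native stretch of bonds i..j (0-based, inclusive)."""
    n = len(eps)
    w = [[1.0] * n for _ in range(n)]
    for i in range(n):
        e = 0.0
        for j in range(i, n):
            # extending the stretch to bond j adds the contacts (a, j) with i <= a < j
            e += sum(eps[a][j] * delta[a][j] for a in range(i, j))
            w[i][j] = math.exp(-e / RT)
    return w

def partition_function(eps, delta, q, RT):
    """Partition function via a recursion on the position of the last unfolded bond."""
    n = len(q)
    w = island_weights(eps, delta, RT)

    def stretch(i, j):
        # weight of the native stretch i..j; an empty stretch (j < i) has weight 1
        return w[i][j] if j >= i else 1.0

    # a[k]: sum over bonds 0..k-1 with bond k-1 unfolded; a[0] = 1 is the empty chain
    a = [1.0] + [0.0] * n
    for k in range(1, n + 1):
        a[k] = math.exp(q[k - 1]) * sum(a[j] * stretch(j, k - 2) for j in range(k))
    return sum(a[j] * stretch(j, n - 1) for j in range(n + 1))

def partition_function_brute(eps, delta, q, RT):
    """Reference value: explicit sum over all 2^N configurations of Eq. (1)."""
    n = len(q)
    z = 0.0
    for m in itertools.product((0, 1), repeat=n):
        e = sum(eps[i][j] * delta[i][j]
                for i in range(n - 1) for j in range(i + 1, n)
                if all(m[i:j + 1]))
        s = sum(q[k] * (1 - m[k]) for k in range(n))
        z += math.exp(-e / RT + s)
    return z

# Hypothetical 6-bond example
n_bonds = 6
delta_ex = [[1 if (j - i) in (1, 3) else 0 for j in range(n_bonds)] for i in range(n_bonds)]
eps_ex = [[-0.7] * n_bonds for _ in range(n_bonds)]
q_ex = [1.0, 0.6, 1.66, 1.0, 0.6, 1.66]
print(partition_function(eps_ex, delta_ex, q_ex, 0.9),
      partition_function_brute(eps_ex, delta_ex, q_ex, 0.9))
```

The recursion costs a number of operations polynomial in N, whereas the brute-force sum grows as 2^N, which is why an exact treatment remains feasible for chains of a few tens of peptide bonds.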

Before proceeding to describe the equilibrium properties of 1BBL, we discuss the choice of the model parameters. Let us first of all consider the contact map. Different model proteins have been considered in the literature, corresponding to the same core sequence with different choices of the N- and C-terminal parts. For instance, the protein corresponding to PDB code 1BBL has been used in [1], while 1W4H has been used in [10]. The corresponding sequences differ only in the end parts, and have a 45-residue identical subsequence, which goes from residue 7 to 51 for 1BBL and from 126 to 170 for 1W4H; the respective native structures have disordered terminal parts. Since the theoretical model we use requires the knowledge of the atomic coordinates of the protein in the native structure, we consider the longest common subsequence with resolved structure, and take residues 12 to 48 in 1BBL, together with their atomic coordinates from the file 1BBL.pdb.

FIG. 1: Average fraction of folded molecules as a function of temperature (in Kelvin). See text for the discussion of the meaning of the various lines.

We can now move on to the choice of the parameters $\epsilon$ and $q_k$. The entropic costs have often been chosen as in [31], and take values $q_n$ for the more structured parts of the molecule (the peptide bonds preceding a residue marked by B, E, G, H, I or T according to the DSSP classification [38]), and $q_c$ for the remaining, less structured parts. To begin with we take $q_n = 1.66$ and $q_c = 0.6$ as in [20].

The energy scale $\epsilon$ is then determined by imposing the condition that at the experimental denaturation temperature the fraction p of folded molecules is 1/2. Several estimates of the denaturation temperature have been reported. To begin with, we consider $T_m = 329$ K, which is consistent with the estimates in [10] (this choice will be refined in the following). In order to give an estimate of p we introduce the number of native peptide bonds $M = \sum_{k=1}^{N} m_k$ and the average fraction $m = \langle M \rangle / N$ of such bonds. m takes the value 1 at zero temperature and $m_\infty = \frac{1}{N} \sum_{k=1}^{N} \frac{1}{1 + e^{q_k}}$ at infinite temperature. A good definition of the fraction of folded molecules is then $p = \frac{m - m_\infty}{m_0 - m_\infty}$, where $m_0$ is a value which represents well the folded state. For 1BBL we can choose $m_0 = m(T = 0) = 1$. The energy scale $\epsilon$ has therefore to satisfy $p(T_m) = 1/2$. In this way we obtain $\epsilon/R = 99.8$ K and the temperature dependence of p is shown in Fig. 1 (solid line).
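The definition of p can be sketched in a few lines. The infinite-temperature value follows from Eq. (1): when only the entropic term survives, each bond is independently native with weight 1 and unfolded with weight $e^{q_k}$. The function names below are illustrative assumptions, not part of the original work.

```python
import math

def m_infinity(q):
    """Infinite-temperature fraction of native bonds, (1/N) * sum_k 1 / (1 + exp(q_k))."""
    return sum(1.0 / (1.0 + math.exp(qk)) for qk in q) / len(q)

def folded_fraction(m, m_inf, m0=1.0):
    """Fraction of folded molecules p = (m - m_inf) / (m0 - m_inf)."""
    return (m - m_inf) / (m0 - m_inf)

# Example with 36 peptide bonds: 20 'structured' bonds with q = 1.66, 16 with q = 0.6
# (the split 20/16 is a made-up illustration, not the actual DSSP assignment)
q_values = [1.66] * 20 + [0.6] * 16
m_inf = m_infinity(q_values)
print(m_inf, folded_fraction(0.5 * (1.0 + m_inf), m_inf))
```

By construction p vanishes when m equals its infinite-temperature value and equals 1 when m reaches the folded reference value, so p = 1/2 corresponds to m halfway between the two baselines.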

In order to simplify our parameter choice we can try to remove the distinction between more and less structured parts of the molecule (notice that the heterogeneity of the system is not removed, it remains encoded in $\Delta_{ij}$) and, keeping $\epsilon$ fixed to the above value, look for a unique value $q_k = q$, $k = 1, 2, \ldots, N$ such that $p(T_m) = 1/2$. We obtain q = 1.287 and the dashed line in Fig. 1. Since we do not aim at reproducing in detail the experimental results, but just at grasping the basic features of the folding of 1BBL with a minimal model, we consider that the changes introduced by the simplification are negligible and from now on we consider a single parameter q.

FIG. 2: Specific heat for protein 1BBL vs temperature, for the same values of q as in Fig. 1. T is reported in Kelvin, specific heat C in kcal/(K mol), so that the plotted quantity is adimensional.

The effect of q on the temperature dependence of p is clearly seen in Fig. 1, where two other values have been considered, and $\epsilon$ has been adjusted every time so that $p(T_m) = 1/2$. Choosing $q = \ln 2$, as in [36], we have $\epsilon/R = 73.0$ K and the dotted line, while for q = 2, as previously chosen for a model antiparallel $\beta$-sheet [33], we obtain $\epsilon/R = 137$ K and the dash-dotted line. It is evident that $\epsilon$ and, as a consequence, the sharpness of the transition, increase with q. The reinforcement of the transition for increasing q (and $\epsilon$) is also seen in Fig. 2, where the temperature dependence of the specific heat is shown.
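The adjustment of the energy scale so that $p(T_m) = 1/2$ can be sketched as a one-dimensional root search. The example below is only a toy: it assumes a short chain of 8 bonds in which every pair of bonds is a native contact of energy $-\epsilon$ (not the 1BBL contact map), computes the average native fraction by exhaustive enumeration, and finds $\epsilon/R$ by bisection. All names are hypothetical.

```python
import itertools
import math

def mean_native_fraction(eps_over_R, q, T, n=8):
    """<M>/N by brute force for a toy chain where every pair (i < j) is a contact of energy -eps."""
    z = 0.0
    m_sum = 0.0
    for m in itertools.product((0, 1), repeat=n):
        contacts = sum(1 for i in range(n - 1) for j in range(i + 1, n)
                       if all(m[i:j + 1]))
        native = sum(m)
        # Boltzmann weight exp(-H / RT) with H from Eq. (1) and uniform q
        w = math.exp(eps_over_R * contacts / T + q * (n - native))
        z += w
        m_sum += w * native
    return m_sum / (z * n)

def fit_energy_scale(q, T_m, lo=0.0, hi=2000.0, tol=1e-8):
    """Bisection on eps/R (in Kelvin) such that p(T_m) = 1/2, with m0 = 1."""
    m_inf = 1.0 / (1.0 + math.exp(q))

    def p(eps_over_R):
        return (mean_native_fraction(eps_over_R, q, T_m) - m_inf) / (1.0 - m_inf)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for q_val in (math.log(2.0), 1.287, 2.0):
    print(q_val, fit_energy_scale(q_val, 329.0))
```

The numbers produced by this toy have no quantitative relation to the values quoted in the text; the sketch only shows the procedure, and that a larger entropy cost requires a larger energy scale to keep the midpoint at the same temperature.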

In order to determine a pair ($\epsilon$, q) for our analysis we try to fit the experimental results for the native fraction as a function of temperature (from circular dichroism and NMR) reported in [10], Fig. 2(b). We infer the set of data from the coordinates reported in the figure itself, and fit them with theoretical curves from our model, obtaining, as an estimate for our parameters, the values q = 1.589 and $\epsilon/R = 114$ K, which we shall use from now on. These correspond to $T_m = 326.2$ K, which is consistent with the estimates reported in [10]. The fit is reported in Fig. 3. The quality of the fit is not optimal, especially at high temperatures, where apparently there is a change in the unfolded minimum as T increases. It could probably be improved by introducing enough heterogeneity in the interaction and entropy parameters $\epsilon_{ij}$ and $q_k$, but this would introduce a huge number of parameters to be fitted, in the end at the expense of simplicity and of a clear understanding of the basic features underlying the folding of 1BBL. Alternatively, we have checked (data not shown) that improving the quality at high temperature by setting $\epsilon/R = 140$ K and q = 2.075 (to ensure the same value of $T_m$), at the expense of the low temperature region, introduces only minor changes in the free-energy profiles reported in Section V. On the other hand, we have also noticed that if we blindly assumed that at T = 373 K the protein is completely denatured, using m(T = 373 K) instead of $m_\infty$ as the baseline for the denatured state in the definition of p, then it is possible to get a much better fit of the data by only adjusting q to q = 1.580: a small change that leaves the profiles (see Sec. V) almost unchanged. We mention this here to stress how the choice of the baselines can dramatically affect the quality of the fits; however, in the following we go back to our first choice of parameters, and to the original and correct choice of $m_\infty$ as the true baseline, just noticing that Fig. 3 suggests that our model behaves in a less cooperative way than the true protein: we will take this into account when discussing our results for kinetics and profiles, in the next sections.

FIG. 3: Average fraction of folded molecules as a function of temperature. Fit to experimental data.

The same procedure is carried out for the WW-domain of protein PIN1 (PDB code 1I6C; for simplicity in the following we will often refer to it as PIN1), which is a clear two-state folder of a size very close to that of 1BBL (38 peptide bonds instead of 36) and will be used for comparison throughout the paper. By using $T_m = 332$ K and fitting to the data of [39], Fig. 3(d), we obtain q = 1.185 and $\epsilon/R = 60$ K. See Fig. 4 for the corresponding fit. Here the reference value $m_0$ for the folded state has been chosen as $m_0 = m(T = 273\ \mathrm{K}) < 1$, since within this model the molecule orders perfectly only at temperatures T < 50 K, while a wide plateau in m(T) is observed in the range 200 to 300 K (see inset).

FIG. 4: Same as Fig. 3 for the WW-domain of protein PIN1. Inset: plot of the fraction of native peptide units, m, over a wider temperature range. Notice that a short tail of the protein, corresponding to the residues having little structure in the native state, denatures at very low temperature. We disregard this behavior and define p to account for the jump around $T_m = 332$ K, where the protein unfolds completely and cooperatively.

From the above results, we can see a first difference between 1BBL and PIN1, in that the experimental results concerning the latter can be better fitted than those relative to the former, when the same level of description is used for the two systems. That is, as long as only the geometric aspects of the native states are considered, by imposing a common energy scale $\epsilon$ for all contacts and $q_i = q$ for all bonds, the model performs better in reproducing the all-$\beta$ protein PIN1 WW-domain than the helical one 1BBL, and the latter appears to be less cooperative in the model than in reality.

We conclude this section by observing that in any case it is difficult to extract from the equilibrium results reliable information about the two-state vs. downhill behavior of the molecules. Observables like the average fraction of native residues, i.e. those that correspond to the first moment of the probability distribution, cannot distinguish between unimodal and multimodal distributions, and can assume the same value whether the underlying distribution is that of a two-state folder or that of a downhill one. In principle, specific heat conveys more information, and a measure of cooperativity is represented by the ratio $\kappa$ of van't Hoff to calorimetric enthalpy (see Ref. [40] for a critical discussion of several definitions of this parameter). However, also in that case, it is not easy to put a clear threshold between two-state, cooperative behavior, which ideally corresponds to $\kappa = 1$, and less cooperative behavior, for $\kappa < 1$.

To obtain a deeper insight into the folding mechanism of the two proteins, more detailed studies of the kinetics and of the free energy profiles are needed, and will be carried out in the next sections.



IV. KINETICS 



In the present section we shall study the time behavior of our model when it is subject to a simple Metropolis kinetics (the effect of different choices for the kinetics has been discussed in [33]). More precisely, we consider a single "bond-flip" kinetics, defined by the discrete-time master equation:

$$p_{t+1}(x) = \sum_{x'} W(x' \to x)\, p_t(x'), \qquad (2)$$

where $x = \{x_{ij},\ 1 \le i \le j \le N\}$ denotes a configuration which satisfies the constraints described in the previous section, and $p_t(x)$ is the probability of configuration x at time t. The transition probability $W(x \to x')$ is assumed to vanish if x and x' differ by more than one peptide bond, and is given by

$$W(x \to x') = \frac{1}{N} \min\left\{ 1, \exp\left[ -\frac{H(x') - H(x)}{RT} \right] \right\} \qquad (3)$$

if x and x' differ by exactly one peptide bond, and $W(x \to x)$ by normalization.
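A single move of this kinetics can be sketched as follows: picking one of the N bonds uniformly and accepting the flip with the Metropolis probability reproduces the factor 1/N and the acceptance rule of Eq. (3). The energy function used here is a hypothetical toy (every pair of bonds is a native contact), chosen only to make the example self-contained; it is not the 1BBL or PIN1 contact map.

```python
import math
import random

def toy_energy(m, eps=-1.0, q=1.0, RT=1.0):
    """Eq. (1) for a toy chain in which every pair of bonds (i < j) is a native contact."""
    n = len(m)
    e = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            if all(m[i:j + 1]):
                e += eps
    return e - RT * q * sum(1 - mk for mk in m)

def metropolis_step(m, energy, RT, rng):
    """One elementary move: choose a bond uniformly, flip it with probability min{1, exp(-dH/RT)}."""
    k = rng.randrange(len(m))
    proposal = list(m)
    proposal[k] = 1 - proposal[k]
    d_h = energy(proposal) - energy(m)
    if d_h <= 0 or rng.random() < math.exp(-d_h / RT):
        return proposal
    return list(m)

# Short trajectory starting from the fully disordered state
RT_EX = 0.5
rng_ex = random.Random(0)
conf = [0] * 10
trajectory = []
for _ in range(2000):
    conf = metropolis_step(conf, lambda c: toy_energy(c, RT=RT_EX), RT_EX, rng_ex)
    trajectory.append(sum(conf) / len(conf))
print(trajectory[-1])
```

With this convention one time step corresponds to proposing the change of the state of a single peptide bond, which matches the elementary time unit used for the equilibration times below.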

The kinetics can be studied by two different approaches, namely direct Monte Carlo (MC) simulations and the LEA developed in [33, 34], where it is assumed that the system probability evolves in a restricted space given by those probabilities which satisfy the same factorization property as the equilibrium one (these can be thought of as equilibrium probabilities of models with the same WS structure but different values of the parameters). It has been shown in [33] that this approximation is quite accurate for proteins of a size comparable to 1BBL.

In Fig. 5 we report the average fraction of native bonds m as a function of time (solid line) at the temperature T = 329 K and at two lower temperatures, 309 K and 289 K. The initial condition is disordered, corresponding to equilibrium at infinite temperature. Single exponential fits $m(t) = m_\infty - a \exp(-t/\tau)$ (here $m_\infty$ denotes the long-time limit of m(t)) are also reported and the equilibration times are $\tau = 2.64 \times 10^4$, $1.95 \times 10^4$ and $7.56 \times 10^3$ for T = 329, 309 and 289 K respectively; the time unit, here and in the following, is the elementary time corresponding to proposing a change of the state of a peptide bond. It is clearly seen that the deviations from a single exponential behavior are more relevant at lower temperatures, that is when $\tau$ is smaller. Notice that the t axis extends over a duration slightly longer than the longest $\tau$.

FIG. 5: Average fraction of native peptide bonds vs. time in LEA (solid lines) and single exponential fits (dashed lines) for protein 1BBL.

For comparison, we have repeated the same analysis for protein PIN1, and the corresponding results are reported in Fig. 6. We considered temperatures T = 332, 312 and 292 K, the equilibration times being $\tau = 1.31 \times 10^5$, $1.06 \times 10^5$ and $5.61 \times 10^4$ respectively. These characteristic times have been computed, within the present model, in [33] for a wide range of temperatures, and turned out to reproduce quite well the experimental data. Observe that 1BBL is considerably faster than PIN1, roughly 5 times at the denaturation temperature. Here the deviations from the single exponential behavior are much less relevant (notice that here the t axis extends over a duration of order 1/40th of the largest $\tau$).

It has been suggested [2] that downhill folding can imply a nonexponential time behavior. Although the large time single exponential behavior is formally a direct consequence of Eq. (2), one might argue that the much larger deviations we observe for 1BBL could be a signature of a potential downhill folding behavior. Several types of analytical time behavior, as alternatives to the single exponential, have been proposed for downhill folders, including multiexponential and stretched exponential [41], which are actually related to each other [42]. In order to test these ideas we made multiexponential fits, using up to three exponentials, to our results.

FIG. 6: Same as Fig. 5 for protein PIN1.

The fits are made with the following procedure: the relaxation rates obtained previously from equilibrium (long time) analysis are factored out from the kinetics data for m(t), and a fit is performed according to

$$(m(t) - m(\infty))\, e^{kt} = a + b\, e^{-\mu t}, \qquad (4)$$

or

$$(m(t) - m(\infty))\, e^{kt} = a + b\, e^{-\mu t} + c\, e^{-\nu t}, \qquad (5)$$

so that the second and third smallest rates are $k + \nu$ and $k + \mu$. Results for the two-exponential fit are reported in Table I.

From these we notice that the dominant correction to the main exponential behavior decays with a characteristic time which is of the same order as $\tau$ (at low temperatures) or up to two orders of magnitude smaller (at higher T) for 1BBL; on the contrary, the PIN1 WW-domain exhibits a time scale separation of 4-5 orders of magnitude. The above remarks hold true also for the three-exponential fits: in the case of 1BBL, the introduction of $\nu$ induces an adjustment of $\mu$ with respect to the value previously found, but both $\nu$ and $\mu$ stay within the two-orders-of-magnitude range found above. For the PIN1 WW-domain one gets the same $\mu$, while the third rate $k + \nu$ represents just a small correction of the relaxation rate k (data not shown), and strongly correlates with it, signalling that the two-exponential fit is enough to grasp the essential features of the spectrum. The difference in the rates is a further confirmation that 1BBL deviates more than the PIN1 WW-domain from the single-exponential time behavior: since the existence of a relevant barrier induces a separation of time scales, the small gap in the rates for 1BBL suggests that the barrier is not very pronounced in this case. It is interesting to notice that the above findings are in agreement with the results for a different fast-folding protein, $\lambda_{6-85}$, studied in Ref. [43], where the observed departure from simple exponential kinetics can be described by a Langevin dynamics on a suitable double well potential with a small barrier.

              1BBL                                               WW-domain
T [K]   k [1/$\tau_f$]           $\mu$ [1/$\tau_f$]         T [K]   k [1/$\tau_f$]           $\mu$ [1/$\tau_f$]
289     $1.3 \times 10^{-4}$     $4.31 \times 10^{-5}$      292     $1.78 \times 10^{-5}$    $1.55 \times 10^{-2}$
309     $5.08 \times 10^{-5}$    $4.6 \times 10^{-5}$       312     $9.47 \times 10^{-6}$    $1.46 \times 10^{-2}$
329     $3.78 \times 10^{-5}$    $2.87 \times 10^{-3}$      332     $7.64 \times 10^{-6}$    $1.36 \times 10^{-2}$

TABLE I: Results for the two-exponential fit of the rates for 1BBL and PIN1 WW-domain. See text for discussion. Here $\tau_f$ is the elementary time associated to the flip of the state of a peptide unit.

Before accepting the above arguments it is however important to check that the above results are not artifacts of the local equilibrium approximation. An exact solution of the kinetics is not feasible, but MC simulations can provide a reference result if we average over a large enough number of trajectories (which is much more time consuming than LEA). We have therefore compared the LEA results at the lowest temperatures, both for 1BBL and PIN1, with MC results averaged over a large number of trajectories (Fig. 7 for 1BBL and Fig. 8 for PIN1).

One can see that our conclusions about the deviation from the single exponential behavior are confirmed by MC simulations. In addition, we recall that LEA characteristic times are lower bounds of the exact ones [33, 34] and indeed our MC simulations give $\tau \simeq 9.74 \times 10^3$ for 1BBL and $\tau \simeq 6.30 \times 10^4$ for PIN1.

FIG. 7: Average fraction of native bonds as a function of time for 1BBL at T = 289 K. Solid line: MC, dashed line: single exponential fit to MC, dotted line: LEA.

FIG. 8: Same as Fig. 7 for PIN1 at T = 292 K.

It is also interesting to take a look at single MC trajectories, which might be representative of the behavior of single molecules. In Fig. 9 we report the time behavior of the fraction M/N of native peptide bonds in a typical Monte Carlo simulation at a temperature T = 289 K, well below the denaturation temperature, starting from a disordered state. It is evident that the behavior resembles that of a two-state system, with an almost saturated ordered state and a rather broad disordered state, characterized by large fluctuations. Similar results are obtained at temperatures closer to $T_m$, where of course the ordered state fluctuates more and the system switches repeatedly between the two states.

In order to observe a different, continuous-like behavior, one has to go to even lower temperatures, for instance T = 269 K (Fig. 10). Notice that at the same temperature PIN1 still has a clear two-state behavior, which persists down to 170 K.

FIG. 9: Fraction of native peptide bonds vs. time in a Monte Carlo simulation, starting from a disordered configuration, at T = 289 K.

FIG. 10: Same as Fig. 9 at T = 269 K.

Aiming to get a deeper understanding of these results and to complete our analysis of the 1BBL behavior, we move on to the study of its free energy profile as a function of the number of native peptide bonds, which will be the subject of the next section.

V. FREE ENERGY PROFILES 

The equilibrium free energy profile F(M) as a function of the number M of native peptide bonds can be easily computed in an exact way as shown in [31], and the result is reported in Fig. 11 for various temperatures.

For comparison, we report the same quantity for the WW-domain of protein PIN1 in Fig. 12. One can see that both molecules can exhibit a barrier in their profile, though 1BBL has a much smaller barrier, which indeed disappears at low enough temperature, and a much wider unfolded minimum with respect to PIN1, whose free energy profile exhibits a well-defined two-state structure. At T = 289 K, the wide plateau-like region in the range M ~ 10 to 25 for 1BBL corresponds approximately to the wide fluctuations observed in the single MC trajectory. Notice that upon raising the temperature this flat region moves towards more unfolded states, and this shift gives rise to the long tail we can see in Fig. 3, while a barrier appears in the vicinity of the native state and then moves towards it. On the other hand, at T = 269 K, one gets an almost linear profile, decreasing with M. In order to get a similar profile for PIN1 one should go to much lower temperatures. It is not possible to define a sharp threshold, but the value of 170 K obtained in the previous section is certainly a good estimate.

FIG. 11: The 1BBL free energy profile F(M) for various temperatures.

FIG. 12: The PIN1 free energy profile F(M) for various temperatures.
It is also interesting to compare the barrier heights observed above with the rates computed in the previous section.

Indeed, if (and only if) we assume a two-state behavior, it is possible to define folding and unfolding rates $k_f$ and $k_u$, that are necessarily related to the measured equilibration rate k and to the average fraction of native peptide units m (through p) by the equations $k = k_f + k_u$ and $k_f/k_u = p/(1 - p)$. Figure 13 reports the results for the folding, unfolding and global equilibration rates for 1BBL and PIN1: it can be noticed that values of $k_f$ increasing with temperature appear in both proteins, signalling the region where the two-state assumption fails. However, for the WW-domain this happens only at temperatures greater than 367 K, well above the mid-point of the transition, while for 1BBL $k_f$ presents a minimum at T = 340 K, rather close to $T_m = 326.2$ K. Then, assuming the Arrhenius-like behavior implied by the two-state hypothesis, we looked for a linear correlation between the natural logarithm of the folding rate $\ln k_f$ and the height of the folding barrier divided by RT. We find that the linear relationship between $\ln k_f$ and $\Delta F_{\ddagger - U}/(RT)$ (where $\ddagger$ indicates the barrier top and U the unfolded minimum) exists only for values of the barrier below 3.8 RT approximately (corresponding to the region of temperatures 302 - 340 K); then the folding rate would present a minimum and would start to increase for increasing barrier, signalling that the two-state hypothesis cannot be applied any more. For PIN1, we find that the linear correlation is very good in the larger range of temperatures 296 - 350 K. The linear fits reveal that for 1BBL the absolute value of the slope is 0.6, while for PIN1 it is 0.85 (data not reported), which is again an indication of the bigger deviation of 1BBL from two-state behavior. The fact that the slope is not 1 even for PIN1 is probably related to the above-mentioned small overestimation of the rates by LEA. In summary, based on the above results, one cannot regard 1BBL as a true downhill folder, since the typical features of a downhill folder appear only at low enough temperatures, and not in the whole temperature range. It is however true that its behavior differs qualitatively from that of the PIN1 WW-domain, since to get a similar behavior for the latter one would have to go down to unphysical temperatures.

FIG. 13: Natural logarithm of the folding, unfolding and relaxation rates of protein 1BBL and of the WW-domain of PIN1, vs 1/T. Rates are measured in units of the inverse of the elementary time, corresponding to flipping the state of one peptide bond. Temperatures are in Kelvin.
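Under the two-state assumption, the relations $k = k_f + k_u$ and $k_f/k_u = p/(1-p)$ can be inverted to give $k_f = p\,k$ and $k_u = (1-p)\,k$. The short sketch below implements this inversion; the function name and the sample value of p are illustrative assumptions, while the rate is the relaxation rate listed in Table I for 1BBL at 329 K.

```python
def two_state_rates(k, p):
    """Split a relaxation rate k into (k_f, k_u) using k = k_f + k_u and k_f / k_u = p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p * k, (1.0 - p) * k

# Illustration: relaxation rate of 1BBL at 329 K from Table I, with a made-up folded fraction
k_f, k_u = two_state_rates(3.78e-5, 0.4)
print(k_f, k_u)
```

Since the split only rescales k by p and 1 - p, any non-Arrhenius behavior of the resulting $k_f$ reflects the failure of the two-state hypothesis rather than a property of the inversion itself.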

VI. CONCLUSIONS 

We have analyzed the folding behavior of protein 1BBL, which has been proposed as a downhill folder, compared to that of the WW-domain of protein PIN1, which is a two-state folder, in the framework of the WS model, a statistical mechanical Go-like model with binary variables. The model has been solved exactly at equilibrium, while for the kinetics we have used Monte Carlo simulations and the semi-analytical local equilibrium approach. The model parameters have been determined by means of an analysis which indicates that the distinction which is usually made, between the entropic contributions of the more structured and the less structured parts of the chain, is not crucial in determining the overall behavior of the system. This result allows one to reduce the number of parameters and can be useful also in future studies based on the same model.

Comparing the behavior of the two molecules we have seen that 1BBL departs from a clear two-state behavior in several respects. Deviations from a single exponential behavior are more relevant than in the case of PIN1, single MC trajectories can differ markedly from those of a two-state folder at low enough temperature, and free energy profiles exhibit a very small barrier, which vanishes upon lowering the temperature. All these features would indeed agree with the usual expectations for a downhill folder if they were exhibited in a wide temperature range, and not just at very low temperature. However, we do find a small barrier in the free-energy profile around the temperature of the transition midpoint $T_m$, as measured by half the variation of the fraction of folded residues between the two asymptotic values of low and high temperatures (i.e. the native and unfolded baselines). This prevents us from considering 1BBL as a true downhill folder, and suggests that the picture of a fast, but barrier-limited, folding applies at least in the region around $T_m$. On top of this, one should consider that, at this extremely simplified level of description, our model appears to be less cooperative than the real protein, as Fig. 3 suggests, so that the similarity to a two-state folder could be even more pronounced in reality.

One could, of course, argue that even PIN1 can exhibit the same phenomena as 1BBL, albeit at unphysically low temperatures. Indeed, as has already been recognized in the literature, two-state and downhill folders do not make up two well-separated classes; they are just two extreme cases of a range of possible behaviors. It is however true that the crossover between these two different behaviors occurs, at least in the present model, in a temperature range which is not biologically relevant for PIN1, while it is close to physiological for 1BBL. Therefore, our results can be regarded as a confirmation that 1BBL cannot be considered a clear two-state folder either. It is rather an example of a molecule which crosses over from a weak two-state behavior at the denaturation temperature to a downhill folding behavior at lower temperatures.

Our results are apparently at odds with the theoretical results reported by Wang and coworkers [45], who use Langevin dynamics of a Go-like model to study 1BBL, and find that the downhill scenario holds at all temperatures within their modelling scheme. However, as they point out, different folding behavior appears to be triggered by the average number of non-local contacts per residue, and this could explain the differences between their results and ours, since we resort to a different definition of the contact map, with all atom-atom contacts, because it allows a better fit of the parameters of the WS model to the experimental data. On the other hand, we find that this choice increases the ratio of non-local to local contacts: the average number of non-local contacts per residue is $N_n = 1.65$, which is above the threshold that Wang and coworkers report for two-state behavior. This also suggests a word of caution about the model-dependent features of theoretical results, along the very same lines as previously reported results, where an analysis is performed of the influence of the cutoff distance, used to define the contact map, on the degree of cooperativity of the folding transition. The robust features that emerge from our results are that protein 1BBL is indeed less cooperative than the WW-domain of protein PIN1, and departs significantly from a clear two-state behavior.